📊 IVF Indexes
Specific
Inverted File Index, Vector Clustering, Quantization, ANN Search
Scoured 28933 posts in 120.7 ms
Evolutionary fine tuning of quantized convolution-based deep learning models
🧠 LLM Inference · arxiv.org · 4d
Qdrant 1.18 - TurboQuant
🎯 Qdrant · qdrant.tech · 1d
Fitting Multilinear Polynomials for Logic Gate Networks
📊 Vector Databases · arxiv.org · 11h
Pretraining large language models with MXFP4
🔤 Tokenization · arxiv.org · 11h
RateQuant: Optimal Mixed-Precision KV Cache Quantization via Rate-Distortion Theory
🔬 RaBitQ · arxiv.org · 1d
OSAQ: Outlier Self-Absorption for Accurate Low-bit LLM Quantization
🔬 RaBitQ · arxiv.org · 5d
Fitting Is Not Enough: Smoothness in Extremely Quantized LLMs
🧠 LLM Inference · arxiv.org · 11h
Amortized-Precision Quantization for Early-Exit Vision Transformers
🎯 Vector Quantization · arxiv.org · 1d
RDKV: Rate-Distortion Bit Allocation for Joint Eviction and Quantization of the KV Cache
🔬 RaBitQ · arxiv.org · 11h
Taming the Entropy Cliff: Variable Codebook Size Quantization for Autoregressive Visual Generation
🔬 RaBitQ · arxiv.org · 4d
Quantizing With Randomized Hadamard Transforms: Efficient Heuristic Now Proven
🔬 RaBitQ · arxiv.org · 4d
HeadQ: Model-Visible Distortion and Score-Space Correction for KV-Cache Quantization
🔬 RaBitQ · arxiv.org · 6d
eOptShrinkQ: Near-Lossless KV Cache Compression Through Optimal Spectral Denoising and Quantization
🗜️ Vector Compression · arxiv.org · 6d
Quantized Probabilistic AI for Gear Fault Diagnosis in Motor Drives
🧠 LLM Inference · arxiv.org · 5d
Hardware-Aware Neural Feature Extraction for Resource-Constrained Devices
📦 Batch Embeddings · arxiv.org · 5d
EdgeRazor: A Lightweight Framework for Large Language Models via Mixed-Precision Quantization-Aware Distillation
🔢 BitNet · arxiv.org · 5d
Topology-Constrained Quantized nnUNet for Efficient and Anatomically Accurate 3D Tooth Segmentation
0 Binary Vector Embeddings · arxiv.org · 5d
PACZero: PAC-Private Fine-Tuning of Language Models via Sign Quantization
🔢 BitNet · arxiv.org · 4d
When Quantization Is Free: An int4 KV Cache That Outruns fp16 on Apple Silicon
🖥️ Hardware Architecture · arxiv.org · 4d
Saliency-Aware Regularized Quantization Calibration for Large Language Models
🧠 LLM Inference · arxiv.org · 4d